Assessing colposcopy competencies in medically underserved communities: a multi-center study in China

Background Colposcopy plays an essential role in diagnosing cervical lesions and directing biopsy; however, few studies have examined the capabilities of colposcopists in medically underserved communities in China. This study aims to fill this gap by assessing colposcopists' competencies in these communities. Methods Colposcopists in medically underserved communities across China were eligible to participate. Participants were presented with 20 cases, each consisting of several images and various indications, and were asked to determine the transformation zone (TZ) type, to provide a colposcopic diagnosis, and to decide whether biopsy was necessary. Participants were categorized according to the number of colposcopic examinations they performed, i.e., above or below 50 per annum. Results There were 214 participants in this study. TZ determination accuracy was 0.47 (95% CI 0.45, 0.49). Accuracy for colposcopic diagnosis was 0.53 (95% CI 0.51, 0.55). Accuracy of the decision to perform biopsy was 0.73 (95% CI 0.71, 0.74). Participants had a sensitivity of 0.61 (95% CI 0.59, 0.64) and a specificity of 0.80 (95% CI 0.79, 0.82) for detecting high-grade lesions. Colposcopists who performed more than 50 colposcopies per year were more accurate than those who performed fewer across all indicators, with a higher sensitivity (0.66 vs. 0.57, p = 0.001) for detecting high-grade lesions. Conclusions In medically underserved communities of China, colposcopists appear to perform poorly at TZ identification, colposcopic diagnosis, and deciding when to biopsy. Colposcopists who undertake more than 50 colposcopies each year performed better than those who perform fewer. Colposcopic practice therefore appears to improve with case exposure, although there is an urgent need for further pre-professional and clinical training.


Introduction
More than 600,000 new cases of cervical cancer occur worldwide each year, with 20% occurring in mainland China [1]. China is also one of few countries experiencing an increase in cervical cancer incidence [2]. However, China has proactively heeded the World Health Organization's call to expedite the elimination of cervical cancer. This call established ambitious targets to attain by 2030, including a 70% screening coverage and a 90% treatment coverage [3]. Colposcopy has consistently held a pivotal role in diagnosing precancerous lesions among individuals with abnormal screening outcomes, serving as a crucial guide for treatment. Therefore, the proficiency of colposcopists is a key factor in determining the overall effectiveness of cervical cancer prevention and control initiatives.
An extensive meta-analysis assessing the diagnostic capabilities of Chinese colposcopists revealed an average agreement of 68.35% when comparing colposcopic findings to histopathology results [4]. Agreement was 70.22% in tertiary hospitals compared to 61.43% in primary hospitals. However, it is important to acknowledge that this meta-analysis involved more tertiary hospitals, and research into colposcopic practice in medically underserved communities remains scarce. Additionally, the aforementioned study did not assess other key competencies, such as the determination of the transformation zone (TZ) and the decision to perform biopsies. These aspects represent key components of colposcopists' skillset that warrant further scrutiny.
In medically underserved communities in China, many doctors who perform colposcopies are either obstetricians and gynecologists or, in some instances, midwives who may not have received systematic colposcopy training. Additionally, exposure to clinical cases is often limited, especially to high-grade cases. Although operational guidelines and quality control standards have been published by official organizations, it has been reported that practitioners find these standards difficult to follow [5]. These factors combined contribute to the inadequate capabilities of colposcopists in these areas. This study aims to quantitatively assess colposcopists' competencies in medically underserved communities across mainland China. The findings will provide insights to improve competencies and a baseline from which to measure improvements.

Study design and population
The recruitment announcement was published online and offline. Competence testing was conducted in September 2022 to assess colposcopists' capabilities in medically underserved communities of China. Primary colposcopists in primary and secondary hospitals in medically underserved communities were eligible to participate. Medically underserved communities are defined as specific populations that have a notable shortage of primary healthcare services or otherwise face unmet healthcare needs [6]. This study was approved by the Institutional Review Board of the Chinese Academy of Medical Sciences and Peking Union Medical College (Reference: CAMS & PUMC-IEC-2022-022). All participants were required to provide informed consent before participating.

Procedure
Sociodemographic data were collected and included age, gender, ethnicity, education level, hospital level, and number of annual colposcopic examinations. The test comprised 20 cases and was designed to assess colposcopists' competencies. For each case, participants were provided with a series of time-stamped colposcopic images, which included an initial image along with a minimum of four additional images stained with acetic acid. Alongside these images, participants were provided with colposcopic indications, according to clinical practice requirements. After thorough examination and analysis of the information provided, participants were asked to determine TZ type (either I/II or III), to provide a colposcopic diagnosis (either normal/benign, low-grade squamous intraepithelial lesion [LSIL], or high-grade squamous intraepithelial lesion or worse [HSIL+]), and to decide whether or not to perform biopsies.
Cases were categorized as normal/benign, LSIL, or HSIL+ based on biopsies taken during initial colposcopy. These determinations were confirmed by an expert panel. The expert panel comprised three specialist colposcopists, each with over 20 years of experience in interpreting colposcopic images and each evaluating more than 500 cases annually. All images were assessed separately by the experts, and in cases of disagreement, a multilateral meeting of independent assessors was held to decide. Decisions to biopsy were taken in accordance with the ASCCP (American Society for Colposcopy and Cervical Pathology) Colposcopy Standards [7]. Patients with positive biopsies were classified according to the most severe pathological diagnosis, while cases with negative biopsies or without biopsies were graded through colposcopic consensus. The 'ground truth' of TZs was formulated by the expert panel in accordance with established guidelines. The TZ was categorized as type I when the entire TZ, including all upper limits, was located on the ectocervix. Type II and type III involve endocervical components. In type II, the upper limits of the TZ can be observed using specific devices. If the upper limits were only partially visible or completely invisible, the TZ was classified as type III.
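The TZ classification rules described above can be expressed as a short decision function. The following is a minimal sketch only: the function names, the `TZType` enumeration, and the two boolean inputs are hypothetical summaries of what an examiner observes, and are not taken from the study's materials or from any guideline text.

```python
from enum import Enum


class TZType(Enum):
    TYPE_I = "I"
    TYPE_II = "II"
    TYPE_III = "III"


def classify_tz(fully_ectocervical: bool, upper_limit_fully_visible: bool) -> TZType:
    """Classify a transformation zone from two (hypothetical) observations.

    fully_ectocervical: the entire TZ, including all upper limits, lies on
        the ectocervix.
    upper_limit_fully_visible: the upper limits of the TZ can be fully
        observed (with devices if necessary).
    """
    if fully_ectocervical:
        # Entire TZ on the ectocervix.
        return TZType.TYPE_I
    if upper_limit_fully_visible:
        # Endocervical component, but upper limits can be observed.
        return TZType.TYPE_II
    # Upper limits only partially visible or not visible at all.
    return TZType.TYPE_III


def answer_option(tz: TZType) -> str:
    """Collapse the three types into the two answer options used in the test."""
    return "III" if tz is TZType.TYPE_III else "I/II"
```

Because the test offered only two options (I/II or III), the distinction that matters for scoring is whether the upper limits of an endocervical TZ are fully visible.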

Outcomes
Primary outcomes were competency indicators calculated from all responses to the 20 cases provided. Indicators included overall accuracy rates for TZ, colposcopic diagnosis, and decision to perform biopsies, as well as accuracy rates by TZ type (type I/II and type III) and by colposcopic diagnosis (normal/benign, LSIL, and HSIL+). Analysis included rates for overdiagnoses, missed diagnoses, excessive biopsies, and missed biopsies. Diagnostic performance metrics included sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for HSIL+ cases.
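The diagnostic performance metrics for HSIL+ follow from a 2x2 table in which HSIL+ is the positive class and LSIL and normal/benign are pooled as negative. The sketch below illustrates the standard definitions; the `Confusion` class, the function names, and the five example responses are hypothetical and are not data from this study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # HSIL+ cases answered as HSIL+
    fn: int  # HSIL+ cases answered as LSIL or normal/benign
    fp: int  # non-HSIL+ cases answered as HSIL+
    tn: int  # non-HSIL+ cases answered as non-HSIL+


def tally(truths, answers, positive="HSIL+"):
    """Count the 2x2 table from paired reference grades and responses."""
    c = Confusion(0, 0, 0, 0)
    for truth, answer in zip(truths, answers):
        if truth == positive:
            if answer == positive:
                c.tp += 1
            else:
                c.fn += 1
        elif answer == positive:
            c.fp += 1
        else:
            c.tn += 1
    return c


def _ratio(num, den):
    # Return NaN rather than raising when a denominator is empty.
    return num / den if den else float("nan")


def metrics(c):
    """Standard definitions of the four diagnostic performance metrics."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


# Hypothetical responses, for illustration only.
truths = ["HSIL+", "HSIL+", "LSIL", "normal/benign", "LSIL"]
answers = ["HSIL+", "LSIL", "HSIL+", "normal/benign", "LSIL"]
c = tally(truths, answers)
m = metrics(c)
```

In this toy example one of two HSIL+ cases is detected (sensitivity 0.5) and two of three non-HSIL+ cases are correctly called negative (specificity about 0.67).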
Participants were categorized according to the number of colposcopies performed per year. The threshold was set at 50 colposcopies because the European Federation for Colposcopy (EFC) has set a minimum case load of 50 colposcopies per year [8].

Statistical analysis
Sample size was determined using the binomial distribution formula. Colposcopic diagnosis has been reported to be 61% accurate in Chinese primary hospitals [4]. Therefore, with a confidence level of 90% and a 5% acceptable margin of error, the minimum required sample size was calculated to be 156. Sociodemographic characteristics are presented as numbers and percentages. The accuracy rates, the over/missed rates for biopsies and diagnoses, and the diagnostic performance metrics (sensitivity, specificity, PPV, and NPV) are reported as means with corresponding 95% confidence intervals (CI). Accuracy rates between subgroups were compared using a standard t-test, and diagnostic performance metrics were compared using a chi-square (χ²) test. Statistical analysis was conducted using Stata (version 17.0) and R (version 5.3.0). The threshold for statistical significance was set at p < 0.05.
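The sample size statement can be checked with a short calculation. The text does not give the exact formula or critical value, so the sketch below assumes the common normal-approximation formula for a single proportion, n = z² p (1 − p) / d², with p = 0.61 and d = 0.05; the function name and the choice of z values are assumptions. Under that assumption, the reported figure of 156 is reproduced with z ≈ 1.28 (the one-sided 90% quantile), whereas the two-sided 90% value z ≈ 1.645 would give 258. This is a plausibility check, not a statement of the authors' method.

```python
import math


def sample_size(p: float, margin: float, z: float) -> int:
    """Minimum n for estimating a proportion p to within +/- margin.

    Assumes the normal-approximation formula n = z^2 * p * (1 - p) / margin^2,
    rounded up to the next whole participant.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Expected accuracy of 61% [4] and a 5% margin of error, as in the text.
n_one_sided_90 = sample_size(0.61, 0.05, z=1.28)   # 156
n_two_sided_90 = sample_size(0.61, 0.05, z=1.645)  # 258
```

Either way, the 214 participants recruited exceed the stated minimum of 156.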

Results
An overview of the responses from the 214 colposcopists, together with the clinical features of the cases, is presented in Table 4. The age of the patients in the test cases ranged from 25 to 51 years. Among the 20 cases, six were diagnosed as normal/benign, eight as LSIL, and six as HSIL+. The cytology results included eight cases with no intraepithelial lesion or malignancy (NILM), five with atypical squamous cells of undetermined significance (ASC-US), two with atypical cells cannot exclude high-grade squamous intraepithelial lesion (ASC-H), four with LSIL, and one with HSIL. Five cases were HPV-negative; the others were positive for at least one HPV type.

Discussion
This study examined the clinical competencies of 214 colposcopists in underserved communities in China. Overall accuracy when determining TZ was 0.47 (95% CI 0.45, 0.49). Overall diagnostic accuracy was 0.53 (95% CI 0.51, 0.55), with a sensitivity of 0.61 (95% CI 0.59, 0.64) and a specificity of 0.80 (95% CI 0.79, 0.82) in detecting HSIL+. The accuracy of decisions to biopsy was 0.73 (95% CI 0.71, 0.74). Compared with colposcopists who perform at least 50 colposcopies per year, those who perform fewer were less accurate at determining the TZ, at colposcopic diagnosis, and at deciding whether to biopsy. These results highlight substantial problems in underserved communities of China, which necessitate extensive further training for aspiring and practising junior colposcopists in these communities.
Our sample of colposcopists only achieved 0.47 accuracy in determining TZ types, with an accuracy of just 0.22 for type III TZ. The TZ is the epithelium between the original squamocolumnar junction (SCJ) and the new SCJ; it is susceptible to HPV infection and is where squamous cervical cancer originates [9]. Therefore, determining the TZ type helps in accurately diagnosing lesions and in planning the excision range. Another study conducted in Europe reported 0.55 accuracy in determining TZ types [10], which was higher than we observed in this study. However, that study did not provide the distribution of TZ types or the accuracy for each type. This study adds to the evidence base by highlighting significantly lower accuracy in determining type III TZ compared to type I/II TZ. Poor determination of type III TZ can lead to inadequate examinations of the cervical canal, resulting in missed endocervical lesion diagnoses. These missed diagnoses may be related to increased patient discomfort and perhaps an unwillingness (or lack of confidence) to commence a more thorough examination. This finding is consistent with Wei et al. [11], who observed a lower sensitivity for detecting HSIL+ in women with type III TZ compared to type I/II TZ. Colposcopists (like all clinicians) gain confidence with practice, and it is not unreasonable to suggest that training may need to be enhanced so that aspiring and junior colposcopists are more confident in examining patients and communicating with them during potentially painful physical examinations.
Overall agreement between colposcopists' diagnoses and standardized determinations was 0.53, with a sensitivity of 0.61 and a specificity of 0.80 in detecting HSIL+. According to a meta-analysis of Chinese primary hospitals [4], overall agreement between colposcopists and histopathology is approximately 0.61, with sensitivity and specificity for detecting HSIL+ of 0.62 and 0.91, respectively. Findings from more developed areas of China are slightly higher than those observed in this study. Accurately identifying HSIL+ is clinically important because HSIL is the action threshold for immediate treatment [12]. However, the process is complicated by, for example, overlapping features, sampling errors, coexisting infections, and observer variability. The colposcopists involved in this study were more accurate in diagnosing HSIL+ than other lesions, although they only achieved 0.59. Participants who performed no more than 50 colposcopies per year were also less accurate at diagnosing HSIL+ cases than those who performed more than 50 per year.
The accuracy of identifying LSIL and normal/benign cases was also relatively low, and the rate of overdiagnosis was higher than the rate of missed diagnoses. This provides some very necessary insights into the psychology of both junior and senior colposcopists, and has implications for both training and practice. At present, even senior colposcopists in underserved communities may use biopsy as the default assessment because it is the gold standard diagnostic test. However, biopsy should be based on colposcopy-detected lesions rather than on practitioner uncertainty, since it carries a number of complications such as pain, bleeding, infection, and scarring [13]. Colposcopists lacking diagnostic capabilities often resort to biopsies, or even to more aggressive diagnostic procedures involving cervical excision, to establish a diagnosis, which causes unnecessary physical and psychological harm to patients. This has obvious implications, not least for artificial intelligence, which can calculate probabilities and risk almost immediately to assist colposcopists.
Detailed analysis of each case revealed that poor diagnostic performance was due to a lack of familiarity with standard colposcopic features. Difficulty in distinguishing between normal/benign and LSIL lesions appears to be mainly due to a failure to differentiate between metaplastic squamous epithelium and thin acetowhite epithelium, especially when eversion of the columnar epithelium or condylomatoid lesions are present. By contrast, confusion between LSIL and HSIL cases appears to be predominantly caused by difficulty in differentiating between thin/translucent acetowhitening and thick/dense acetowhitening. In which case, considering the borders of acetowhiteness and vascular patterns could [14]. Traditional training is difficult to develop and maintain in resource-limited settings owing to the high cost of venues, personnel, and transportation [15,16]. Digital platforms could therefore supplement traditional approaches, enabling a larger number of colposcopists to be trained in an efficient and low-cost manner, thereby making the training more scalable and sustainable. In addition, such tools provide valuable case resources and expert-led instruction that are difficult to access during daily practice, which could help enhance colposcopy service capabilities in low-resource areas and narrow the gap with resource-rich regions.
In addition to training, the competencies of colposcopists in underserved areas could be enhanced using AI-assisted technologies. These technologies often rely on a large number of high-quality images curated, and even annotated, by experts, which means that AI aggregates extensive experience accumulated over a long period. Therefore, AI has the potential to assist both less experienced and experienced physicians. Recently, a Colposcopic Artificial Intelligence Auxiliary Diagnostic System (CAIADS), developed by Chinese researchers, was found to have superior diagnostic sensitivity and an enhanced ability to predict biopsy sites compared to colposcopists [17,18]. This raises questions about human-AI interaction: if technologies such as CAIADS can improve diagnostic accuracy and biopsy efficiency, it will be necessary to embed a machine-learning-based reinforcement system, which will still require human decisions. This could reduce the subjectivity in traditional colposcopy procedures to some extent and enhance the repeatability of diagnoses. Therefore, deploying such a tool in resource-constrained settings may also improve the competencies considered in this study.
While we assessed colposcopists' competencies in underserved communities in China and provided an understanding of their abilities to identify the TZ, make diagnoses, and decide whether to perform biopsies, this study had some limitations. Even though participants came from over 100 primary or secondary hospitals across 19 provinces in mainland China, the sample was relatively small. This may have introduced sampling bias and resulted in an inaccurate representation. We know nothing about those who declined to participate, and they may have done so for a number of reasons. Perhaps they were simply too busy, or perhaps they lacked the confidence to participate, which also has implications for practice in these communities. In addition, the online format of image-based testing differs from clinical reality, and the distribution of lesion severity may deviate from real-world scenarios. However, the cases included do represent a broad range and were not synthesized. Future research of this type should include more cases, which may have both similarities and dissimilarities. Nevertheless, this study raised a number of additional questions about colposcopy practice in underserved communities in China, which need to be addressed for the well-being of women in these areas.
In conclusion, we observed generally poor abilities in TZ identification, colposcopic diagnosis, and biopsy determination among primary colposcopists in underserved communities of China. More experienced colposcopists, who perform more than 50 colposcopies annually, are more competent than those who perform fewer than 50. This study provides a baseline against which to measure improvement, although there is an urgent need to embed mandatory colposcopy training to ensure more case exposure for aspiring practitioners and professionals in underserved communities of China.

Table 1
Sociodemographic characteristics of colposcopists

Table 2
Accuracy rates for transformation zone, colposcopic diagnosis, and biopsy
Data are presented as means with 95% confidence intervals.
Abbreviations: LSIL, low-grade squamous intraepithelial lesion; HSIL+, high-grade squamous intraepithelial lesion or worse

Table 3
Diagnostic performance for detecting HSIL+
Data are presented as means with 95% confidence intervals.
Abbreviations: HSIL+, high-grade squamous intraepithelial lesion or worse; PPV, positive predictive value; NPV, negative predictive value

Table 4
Clinical features and response distribution for 20 test cases
Abbreviations: NILM, no intraepithelial lesion or malignancy; ASC-US, atypical squamous cells of undetermined significance; ASC-H, atypical cells cannot exclude high-grade squamous intraepithelial lesion; LSIL, low-grade squamous intraepithelial lesion; HSIL(+), high-grade squamous intraepithelial lesion (or worse)